1. PROGRESS DURING THE REPORTING PERIOD.
1.1  Naval  Research  Laboratory  (NRL): 


The CM-5, with 128 nodes, was installed at NRL in November of 1992. In
late December, the upgrade to 256 nodes and 48 drives of Scalable Disk
Array (1 Gigabyte each) was begun. NRL is awaiting native HIPPI. In
the late winter the first few users started using the machine. Now
there are over 100 users with accounts on the CM-5. We discuss some of
the projects and their results below.

Inertial  Confinement  /  Laser  Fusion: 

Jill  P.  Dahlburg,  John  H.  Gardner,  David  E.  Fyfe  (NRL) 
and  external  collaborators 

This group has been working on simulating the full nonlinear
evolution of a 3-dimensional Rayleigh-Taylor instability. Their goal
is "to obtain predictive capability of how the presence of many
RT-unstable modes affect RT single-mode saturation and shape
effects, including finite-thickness target information like the
target mass ratio rhoR min / rhoR max (what experimentalists can
measure) and local minimum values of the mass integral, rhoR (of
primary interest for target design)". The fast processing and large
memory of the CM-5 have allowed them to implement the table look-ups
inherent in the real equation of state and the variable Eddington
multigroup radiation transport calculations. The CM Fortran library
routines minimize memory usage and promote parallel efficiency. The
work has been presented at the 23rd Anomalous Absorption Conference
and submitted to the 1993 American Physical Society/Division of
Plasma Physics Meeting.

Relevant publications include:

J.P.Dahlburg, J.H.Gardner, S.W.Haan, & G.D.Doolen, Phys.Fluids
B, vol 5, 571 (1993).

J.P.Dahlburg, J.H.Gardner, D.E.Fyfe, S.W.Haan, & G.D.Doolen,
(in prep., 1993).

J.P.Dahlburg & J.H.Gardner, Bull.Am.Phys.Soc. vol 37, 1471 (1992).


Weather  Prediction: 

Paul Anderson, Michael Young, Joseph Bradley, David Norton,
Peter Caress (NRL), Joseph Sela (National Meteorological
Center, National Weather Service)

This group has been converting the NOAA National Weather Service
global weather forecast model from a Cray Y-MP/8 version to a
Connection Machine version. The code does not yet run on the CM-5
(because of a dependence on the CMSSL FFT routine) but it is expected
that performance on the CM-5 will ultimately be good.

A  similar  conversion  of  the  Navy  global  spectral  weather  model  is  also 
underway.  Activity  to  date  has  centered  on  conversion  of  the  Navy 
atmospheric  physics  modules. 

Details on their approach to spectral-to-grid conversion have been
documented in a paper submitted for a special issue of the Journal of
Parallel and Distributed Computing.

High T_c Superconductivity:

J.W.  Serene  and  D.W.  Hess  (NRL) 

Using  both  the  CM-200  and  CM-5,  they  are  studying  Anderson  lattice 
models  and  Hubbard  models,  which  are  believed  to  contain  the  essential 
physics  for  understanding  the  role  of  the  strong  interactions  between 
electrons  in  certain  rare  earth  and  actinide  compounds  called  heavy 
electron  systems,  and  in  the  high  temperature  superconductors  and 
related  transition  metal  compounds.  They  have  developed  a  massively 
parallel code that allows them to solve efficiently the set of
nonlinear  coupled  equations  for  the  self  energy  in  what  is  known  as  the 
fluctuation  exchange  approximation. 

Space Surveillance:

Liam  Healy  (NRL) 

This project is concerned with tracking, cataloging and analyzing
the orbital motion of more than 7000 earth-orbiting objects. Orbit
propagation, that is, finding a satellite's future position and velocity
from its current position and velocity, is an essential component of all
aspects of the computing. When the work is parallelized with one
satellite per virtual processor, no communication is needed, because all
satellites evolve independently of one another. Close conjunction
determination, finding pairs of satellites that come within a certain
distance of one another at some point during the orbit, does involve
communication but is much faster than on a serial computer. Both tasks
are now running on the CM-5 and provide capability for studies impossible
before. He is now working towards implementation of other space
surveillance tasks such as orbit determination, track correlation and
object identification.
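
The two tasks described above can be sketched in a few lines. This is
only an illustration under loud assumptions: the orbits are taken to be
circular two-body orbits, the catalog entries are hypothetical, and the
names propagate_circular, propagate_all and close_pairs are invented
here; none of this is taken from the NRL code. Propagation touches each
satellite independently (the "no communication" case), while the
conjunction search must compare positions of different satellites.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2, Earth's gravitational parameter

def propagate_circular(radius_km, phase_rad, inclination_rad, t_seconds):
    """Return the (x, y, z) position in km of one satellite at time t.

    Simplifying assumption: a circular two-body orbit, so the angle along
    the orbit grows linearly at the mean motion n = sqrt(mu / r^3).
    """
    mean_motion = math.sqrt(MU_EARTH / radius_km ** 3)  # rad/s
    theta = phase_rad + mean_motion * t_seconds
    x = radius_km * math.cos(theta)
    y = radius_km * math.sin(theta) * math.cos(inclination_rad)
    z = radius_km * math.sin(theta) * math.sin(inclination_rad)
    return (x, y, z)

def propagate_all(catalog, t_seconds):
    """Propagate every satellite; each result depends only on its own entry."""
    return [propagate_circular(r, ph, inc, t_seconds) for (r, ph, inc) in catalog]

def close_pairs(catalog, times, threshold_km):
    """Return sorted pairs (i, j), i < j, closer than threshold_km at any sampled time."""
    found = set()
    for t in times:
        positions = propagate_all(catalog, t)
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                # This comparison needs data from two different satellites.
                if math.dist(positions[i], positions[j]) < threshold_km:
                    found.add((i, j))
    return sorted(found)

# Hypothetical three-object catalog: (radius km, initial phase rad, inclination rad)
catalog = [
    (7000.0, 0.0, 0.0),      # low orbit
    (7000.0, 0.0005, 0.0),   # same orbit, about 3.5 km ahead
    (42164.0, 1.0, 0.1),     # much higher orbit
]
print(close_pairs(catalog, times=[0.0, 600.0, 1200.0], threshold_km=10.0))
```

Sampling at discrete times can miss a close approach that falls between
samples, so a production screening would need a finer search around
each candidate pair.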

Finite-Volume  Magnetohydrodynamics  Codes: 

Rick Devore (NRL)

Researchers  are  implementing  a  pair  of  2.5-dimensional  (2-dimensional 
spatial  variations,  3-dimensional  vector  fields)  MHD  models  using 
flux-corrected transport (FCT) finite-volume techniques on the CM-5.
Efforts  to  date  have  focused  on  developing  and  optimizing  the  core 
FCT  modules  used  to  time-advance  the  generalized  continuity  and 
hydromagnetic  equations  of  the  models.  The  modules  are  written  in  CM 
Fortran  and  have  been  tested  on  a  simple  2-dimensional  blast  wave 
problem  using  computational  meshes  of  various  sizes.  TMC's  Prism 
development  tool  has  been  used  to  debug  and  gather  performance 
statistics  from  these  tests,  on  both  the  CM-5  and  the  CM-200. 

Preliminary experiments with the CM-5 suggest that we can expect about
700 MFlops sustained speed on a 256x256 grid, compared with the 150
MFlops obtained on a single-processor Cray Y-MP. This reduces the
computation time for a typical simulation from 24 hours to about 5
hours. On a 1024x1024 grid, the communications penalty is smaller and
about 3 GFlops can be attained. This much better resolved calculation
would require about 3 days on the CM-5, compared with some 2 months on
the Y-MP. The newly released run time system for CM Fortran should
substantially improve the performance on the 256x256 problem, whose
compute time is overwhelmingly dominated by the required
communications (circular shifts). It is hoped that future releases of
the Fortran compiler also will be able to take better advantage of the
massive parallelism inherent in these codes and achieve more efficient
use (than the present 20%) of the fast vector units.
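
The run-time figures quoted above follow from scaling the baseline time
by the ratio of sustained rates, and the circular shifts that dominate
the communication are what a data parallel code uses to reach
neighboring grid points. The sketch below illustrates both. The helper
names (scaled_hours, cshift, centered_difference) are invented for this
illustration, the 2-month baseline is taken as 60 days, and the stencil
is a generic centered difference, not the FCT algorithm itself.

```python
def scaled_hours(baseline_hours, baseline_mflops, new_mflops):
    """Estimate run time at a new sustained rate, assuming equal total work."""
    return baseline_hours * baseline_mflops / new_mflops

def cshift(values, shift):
    """Circular shift in the sense of Fortran CSHIFT: result[i] = values[i + shift]."""
    n = len(values)
    return [values[(i + shift) % n] for i in range(n)]

def centered_difference(values, dx):
    """Periodic centered difference built from two circular shifts."""
    right = cshift(values, 1)    # neighbor at i + 1
    left = cshift(values, -1)    # neighbor at i - 1
    return [(r - l) / (2.0 * dx) for r, l in zip(right, left)]

# 256x256 case: 24 hours at 150 MFlops, rerun at 700 MFlops.
print(round(scaled_hours(24.0, 150.0, 700.0), 1), "hours")
# 1024x1024 case: about 60 days at 150 MFlops, rerun at 3000 MFlops.
print(scaled_hours(60.0 * 24.0, 150.0, 3000.0) / 24.0, "days")
```

On a distributed-memory machine each shift moves boundary data between
processors, which is why a small grid, with little work per processor
between shifts, is dominated by communication.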

High  Speed  Combustion  Flows: 

Elaine  S.  Oran  (NRL),  Robert  Whaley  (TMC)  and  external 
collaborators 

A  computational  fluid  dynamics  (CFD)  code  capable  of  simulating  unsteady, 
compressible  reactive  flows  has  been  developed  on  the  massively  parallel 
Connection  Machine.  The  parallel  CFD  code,  which  was  written  in  CM  Fortran, 
has  been  used  to  simulate  multidimensional  detonation  waves  and  to  examine 
the  suitability  of  parallel  computer  architectures  for  computing  reactive 
flows. It has been found that the Connection Machine is a good platform
for simulating unsteady, inviscid compressible flows; however, efficient
integration of the chemical rate equations on a parallel computer sometimes
requires additional programming to properly manage the processor loads.

The  flow  behind  a  shockwave  propagating  into  a  hydrogen-oxygen  gas  has 
been  simulated  on  a  CM-2,  CM-200,  and  a  CM-5  using  the  same  data  parallel 
program. 

In a reactive flow simulation, integrating the chemical species rate
equations on a parallel computer architecture was found to be
inefficient if the processor loads were not actively managed.
Separate routines were written to balance the load among the multiple
processors to achieve efficient utilization of the Connection Machine.

The  new  load  balancing  algorithm  is  based  on  CM  Fortran  Library 
functions  which  create  send  addresses  and  perform  gather/scatter 
operations  on  the  unconverged  grid  points.  Significant  speedups  are 
realized  by  balancing  the  chemistry  integration  such  that  the 
percentage of useful work conducted is maximized. The data parallel
structure and implementation of CM Fortran functions provide a load
balancing routine that is portable between the different architectures
available in the CM-200 and CM-5.
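
The gather/scatter idea described above can be sketched as follows.
This is a minimal illustration, not the CM Fortran routine: the function
name integrate_packed, the toy "chemistry" step (halving a value) and
the convergence test are all assumptions made for the example. The list
of active indices plays the role of the send addresses; only unconverged
points are gathered into a dense array, advanced, and scattered back.

```python
def integrate_packed(state, step, converged, max_sweeps=100):
    """Iterate `step` on unconverged points only, using gather/scatter.

    Returns (final_state, point_updates, sweeps), where point_updates
    counts how many single-point integrations were actually performed.
    """
    state = list(state)
    # "Send addresses": indices of grid points that still need work.
    active = [i for i, value in enumerate(state) if not converged(value)]
    point_updates = 0
    sweeps = 0
    while active and sweeps < max_sweeps:
        packed = [state[i] for i in active]       # gather into a dense array
        packed = [step(value) for value in packed]  # every element is useful work
        point_updates += len(packed)
        for i, value in zip(active, packed):      # scatter results back
            state[i] = value
        active = [i for i in active if not converged(state[i])]
        sweeps += 1
    return state, point_updates, sweeps

# Toy problem: four grid points, two of which start out converged.
final, updates, sweeps = integrate_packed(
    [0.5, 8.0, 0.25, 2.0],
    step=lambda v: v / 2.0,
    converged=lambda v: v <= 1.0,
)
print(final, updates, sweeps)
```

Without packing, every sweep would touch all four points (12 point
updates over 3 sweeps here) even though only 4 updates are useful. On a
parallel machine the dense packed array can be spread evenly over the
processors, which is where the load balance comes from.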

1.2  NASA  Ames  Research  Center. 

The CM-5 system was installed at NASA Ames in the first half of
January, 1993. The system was made available to the user community
during the second half of January. Average availability of the system
has been approximately 95%, and usage is more than 50% of a 24 hour
shift. The CM-5 configuration at NASA Ames consists of 128 nodes with
vector units, 4 Control Processors and 1 I/O Control Processor, and an
SDA with 48 drives. Three HIPPI channels and a crossbar are expected to
arrive in October. NASA Ames has also served as a beta site for new
releases of many software systems, e.g., CMOST, CM Fortran, C*, PNDBX,
and CMAX.


The Ames user community began to explore the MIMD features of the new
CM-5 system. Tim Barth and Sam Linton are in the process of implementing
a turbomachinery unstructured code on the CM-5 (this code previously ran
on the Intel iPSC/860).

Work was also begun in a new direction, parallel scientific
visualization. The CM AVS package was installed and tested rigorously
at Ames. A new program which will visualize Saturn's rings is in the
final stages of testing. The source of data is a parallel code which
simulates Saturn's system of planet, moons, and particles (Creon Levit
and Space Science Division). The visualization (by Arsi Vaziri and Mark
Kremenetsky) is based on CM AVS software.

A  number  of  major  data  parallel  projects  which  used  to  run  on  the  CM-2 
system have been successfully reimplemented on the CM-5.

CM3D:  Dennis  Jespersen,  Creon  Levit  (NASA) 

A  compressible  Navier-Stokes  solver  for  use  on  multiple  overlapping 
three-dimensional  structured  curvilinear  grids.  This  code  is  a  core 
for  developing  a  production  code  for  use  by  the  United  States 
aerospace  industry. 

AMESCEM:  Michael  J.  Shuh  (NASA) 

This is a three-dimensional finite-volume time-domain (FVTD)
electromagnetic code. It is run on multiple block curvilinear grids so
as to predict scattering from complex objects.

DNS  (Direct-Navier-Stokes-Simulation):  Nateri  Madavan  (NASA) 

The  objective  is  to  perform  direct  numerical  simulations  of 
spatially-evolving  compressible  turbulence  using  high-order-accurate 
finite-difference  techniques.  The  development  of  a  DNS  code  is  aimed 
at  providing  accurate  turbulent  inflow  boundary  conditions  for  use  in 
spatial simulations of transition and turbulence. This code currently
achieves about 750 MFLOPS performance in double precision on a 32K
CM-2 and 128 node CM-5. This code requires so much memory that it can
be implemented only on the CM. The CRAY YMP works as a preprocessor for
the CM-5 in this case.
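
The report does not say which high-order scheme the DNS code uses. As a
generic illustration of what "high-order-accurate finite-difference"
means, the sketch below compares a second-order and a fourth-order
centered first derivative on a periodic grid. The function names and
the sine test case are assumptions chosen for the example.

```python
import math

def ddx_second_order(f, dx):
    """Second-order centered first derivative on a periodic grid."""
    n = len(f)
    return [(f[(i + 1) % n] - f[(i - 1) % n]) / (2.0 * dx) for i in range(n)]

def ddx_fourth_order(f, dx):
    """Fourth-order centered first derivative on a periodic grid."""
    n = len(f)
    return [
        (-f[(i + 2) % n] + 8.0 * f[(i + 1) % n]
         - 8.0 * f[(i - 1) % n] + f[(i - 2) % n]) / (12.0 * dx)
        for i in range(n)
    ]

def max_error(approx, exact):
    """Largest pointwise absolute error."""
    return max(abs(a - e) for a, e in zip(approx, exact))

n = 64
dx = 2.0 * math.pi / n
x = [i * dx for i in range(n)]
f = [math.sin(xi) for xi in x]
exact = [math.cos(xi) for xi in x]

err2 = max_error(ddx_second_order(f, dx), exact)
err4 = max_error(ddx_fourth_order(f, dx), exact)
print(err2, err4)
```

For the same grid the fourth-order stencil is far more accurate, at the
cost of reaching two points in each direction, which on a data parallel
machine means more neighbor communication per derivative.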

RANS (Reynolds-Averaged-Navier-Stokes): Nateri Madavan (NASA)

The focus of this project is on the highly compute-intensive end of
the CFD application spectrum. RANS equations are solved in an
implicit, time-accurate manner, using upwind schemes and zonal
methodologies representative of the current state of the art in CFD.

The  major  impetus  for  this  research  is  a  growing  belief  that  MPP  holds  the 
key  to  developing  future  teraflop  capability  and  the  potential  for  meeting 
computer  performance  requirements  of  large  scale  scientific  simulations. 

PSICM: Leonardo Dagum (NASA)

The objectives of this project are to accurately describe high
altitude plume interaction phenomena and accurately simulate expanding
flows starting either at the nozzle exit plane, or at a supplied
starting surface. The core of this code is based on a direct
simulation Monte Carlo method. Using a starting surface obtained from
a Method of Characteristics solution for an Orbiter reaction control
system (RCS) engine plume, the code demonstrated the existence of a
plume/plume self-interaction shock for two engines separated by 69
feet. The self-interaction shock is a complicated three-dimensional
structure and the calculation required the large memory and
performance of the Connection Machine to be completed in a reasonable
amount of time. The implication of a self-interaction shock for
separation distances of 60 feet is highly relevant to the space
station design.

Computational  Fluid  Dynamics  on  Connection  Machine: 

Horst D. Simon (NASA),
Mark D. Kremenetsky, John L. Richardson (TMC)

A two-dimensional implicit Navier-Stokes parallel procedure for
application to compressible turbulent flow was developed along with
the necessary parallel preconditioners and solvers. The
preconditioning phase is crucial for the convergence of the developed
procedures, and an approach to preconditioning for very large block
banded unsymmetric linear systems, based on computing approximate
inverses of the original system, was used. The algorithm exhibits a
natural parallelism which can be effectively exploited on massively
parallel machines. The developed methods were implemented on the
Connection Machines (CM-2 and CM-5) using the CM Fortran language.
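
One common way to build an approximate inverse, sketched here as an
assumption since the report gives no details, is to choose a sparse
matrix M that minimizes the Frobenius norm of A*M - I. The minimization
splits into one independent small problem per column of M, which is the
source of the natural parallelism. The example below uses the simplest
possible sparsity pattern, a diagonal M, for which each column has a
closed-form solution; the function names and the 3x3 test matrix are
hypothetical.

```python
import math

def diagonal_approximate_inverse(A):
    """Diagonal M minimizing ||A*M - I||_F, one independent problem per column.

    For column k the minimizer of ||m * a_k - e_k||_2 over the scalar m is
    m = a_kk / (a_k . a_k), where a_k is column k of A.
    """
    n = len(A)
    m = []
    for k in range(n):  # each k could be handled by a different processor
        column = [A[i][k] for i in range(n)]
        m.append(column[k] / sum(c * c for c in column))
    return m

def frobenius_residual(A, m):
    """Return ||A*M - I||_F for the diagonal matrix M = diag(m)."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for k in range(n):
            entry = A[i][k] * m[k] - (1.0 if i == k else 0.0)
            total += entry * entry
    return math.sqrt(total)

# Small unsymmetric banded test matrix (hypothetical).
A = [
    [4.0, 1.0, 0.0],
    [2.0, 5.0, 1.0],
    [0.0, 1.0, 3.0],
]
m_opt = diagonal_approximate_inverse(A)
m_jacobi = [1.0 / A[k][k] for k in range(len(A))]  # plain diagonal scaling
print(frobenius_residual(A, m_opt), frobenius_residual(A, m_jacobi))
```

A richer sparsity pattern for M turns each column into a small
least-squares problem, but the columns remain independent of one
another, so the same column-per-processor mapping still applies.
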
Relevant  publications  include: 

M.D.Kremenetsky, J.L.Richardson, "A Parallel Unfactored Solver
for Computational Fluid Dynamics," Proceedings of
Parallel CFD '92 Conference, New Brunswick, NJ, May 1992.

M.Grote, H.D.Simon, "Parallel Preconditioning and Approximate
Inverses on the Connection Machine," Proceedings of the
Scalable High Performance Computing Conference (SHPCC)
1992, IEEE Computer Society Press, Los Alamitos,
CA, 1992, pp.76-89.


2. PLANNED ACTIVITIES FOR NEXT REPORTING PERIOD

Continuation  of  projects  discussed  in  1.1  and  1.2.  See  above  for 
additional  information. 


3.  MAJOR  EXPERIMENTAL  OR  SPECIAL  EQUIPMENT  PURCHASED 
-  nothing  - 


4.  CHANGES  IN  KEY  PERSONNEL 
-  none  - 


5.  INFORMATION  DERIVED  FROM  MEETINGS  AND  CONFERENCES 

-nothing  to  report- 

6.  SUMMARY  OF  PROBLEMS  OR  AREAS  OF  CONCERN 

-  nothing  to  report  - 

7.  RELATED  ACCOMPLISHMENTS  SINCE  LAST  REPORT 


-  initial  report  - 


